Orthogonal Principal Feature Selection

Authors

  • Ying Cui
  • Jennifer G. Dy
Abstract

This paper presents a feature selection method based on a popular transformation approach: principal component analysis (PCA). PCA is popular because it finds the optimal solution to several objective functions (including maximum variance and minimum sum-squared-error), and also because it provides an orthogonal basis solution. However, PCA as a dimensionality reduction algorithm does not explicitly indicate which variables are important. We propose a novel method that uses the PCA result to select original features, choosing those that are most correlated with the principal components and, through orthogonalization, as uncorrelated with each other as possible. As a consequence of orthogonalization, our feature selection method preserves the special property of PCA that the retained variance can be expressed as the sum of the variances of the orthogonal features that are kept. Our experiments show that orthogonal feature selection performs better than selection without orthogonalization and, for a fixed number of retained features, consistently picks the best subset of features in terms of sum-squared-error compared to competing methods.
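The idea in the abstract can be illustrated with a short NumPy sketch. This is not the authors' published algorithm; it is a minimal greedy variant written under assumptions: at each step it takes the leading principal component of the current residual data, picks the original feature most correlated with it, and then projects that feature out of all columns (a Gram-Schmidt step). The function name `select_orthogonal_features` and its return values are hypothetical, chosen only for this example.

```python
import numpy as np


def select_orthogonal_features(X, k):
    """Greedy sketch (assumed variant, not the paper's exact method).

    Returns the indices of the selected columns and the sum of squares
    captured by each orthogonalized pick.
    """
    R = X - X.mean(axis=0)  # centered working copy; becomes the residual
    selected = []
    retained = []
    for _ in range(k):
        # Leading principal direction of the current residual data.
        _, _, vt = np.linalg.svd(R, full_matrices=False)
        scores = R @ vt[0]

        # Absolute correlation of each residual feature with the PC scores.
        norms = np.linalg.norm(R, axis=0)
        corr = np.zeros(R.shape[1])
        ok = norms > 1e-12
        corr[ok] = np.abs(R[:, ok].T @ scores) / (
            norms[ok] * np.linalg.norm(scores)
        )
        if selected:
            corr[selected] = -1.0  # never pick the same feature twice

        j = int(np.argmax(corr))
        if norms[j] <= 1e-12:
            break  # nothing left to explain

        selected.append(j)

        # Orthogonalization: remove the chosen direction from every column.
        q = R[:, j] / norms[j]
        proj = np.outer(q, q @ R)
        retained.append(float(np.sum(proj ** 2)))
        R = R - proj
    return selected, retained
```

Because each chosen direction is orthogonal to the earlier ones, the entries of `retained` add up to the total sum of squares captured by the selected features, so the sum-squared-error after `k` picks is the total sum of squares of the centered data minus `sum(retained)`.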


Similar resources

ropls: PCA, PLS(-DA) and OPLS(-DA) for multivariate analysis and feature selection of omics data

4 Hands-on: 4.1 Loading; 4.2 Principal Component Analysis (PCA); 4.3 Partial least-squares: PLS and PLS-DA; 4.4 Orthogonal partial least square...


Comparative Analysis of Wavelet-based Feature Extraction for Intramuscular EMG Signal Decomposition

Background: Electromyographic (EMG) signal decomposition is the process by which an EMG signal is decomposed into its constituent motor unit potential trains (MUPTs). A major step in EMG decomposition is feature extraction in which each detected motor unit potential (MUP) is represented by a feature vector. As with any other pattern recognition system, feature extraction has a significant impac...


Performance of Principal Component Analysis and Orthogonal Least Square on Optimized Feature Set in Classifying Asphyxiated Infant Cry Using Support Vector Machine

Received Aug 26, 2017; revised Nov 2, 2017; accepted Nov 20, 2017. An investigation into an optimized support vector machine (SVM) integrated with principal component analysis (PCA) and orthogonal least square (OLS) in classifying asphyxiated infant cry was performed in this study. Three approaches were used in the classification: SVM, PCA-SVM, and OLS-SVM. Various numbers of features extracted from M...


Feature selection using genetic algorithm for classification of schizophrenia using fMRI data

In this paper we propose a new method for classification of subjects into schizophrenia and control groups using functional magnetic resonance imaging (fMRI) data. In the preprocessing step, the number of fMRI time points is reduced using principal component analysis (PCA). Then, independent component analysis (ICA) is used for further data analysis. It estimates independent components (ICs) of...


A new approach for HIV-1 protease cleavage site prediction combined with feature selection

Acquired immunodeficiency syndrome (AIDS) is a fatal disease that seriously threatens human health. Human immunodeficiency virus (HIV) is the pathogen that causes this disease. Investigating HIV-1 protease cleavage sites can help researchers find or develop protease inhibitors that restrain the replication of HIV-1, thus resisting AIDS. Feature selection is a new approach for solving t...




Publication date: 2008